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Abstract 

Zenith Tropospheric Delay (ZTD) due to water vapor derived from space geodetic techniques and 
numerical weather prediction simulated-reanalysis data exhibits non-linear and non-stationary prop- 
erties akin to those in the crucial geophysical signals of interest to the research community. These 
time series, once decomposed into additive (and stochastic) components, have information about the 
long term global change (the trend) and other interpretable (quasi-) periodic components such as 
seasonal cycles and noise. Such stochastic component(s) could be a function that exhibits at most 
one extremum within a data span or a monotonic function within a certain temporal span. In this 
contribution, we examine the use of the combined Ensemble Empirical Mode Decomposition (EEMD) 
and Independent Component Analysis (ICA): the EEMD-ICA algorithm to extract the independent 
local oscillatory stochastic components in the tropospheric delay derived from the European Centre for 
Medium-Range Weather Forecasts (ECMWF) over six geodetic sites (HartRAO, Hobart26, Wettzell, 
Gilcreek, Westford, and Tsukub32). The proposed methodology allows independent geophysical pro- 
cesses to be extracted and assessed. Analysis of the quality index of the Independent Components 
(ICs) derived for each cluster of local oscillatory components (also called the Intrinsic Mode Functions 
(IMFs)) for all the geodetic stations considered in the study demonstrate that they are strongly site 
dependent. Such strong dependency seems to suggest that the localized geophysical signals embedded 
in the ZTD over the geodetic sites are not correlated. Further, from the viewpoint of non-linear dy- 
namical systems, four geophysical signals — the Quasi-Biennial Oscillation (QBO) index derived from 
the NCEP/NCAR reanalysis, the Southern Oscillation Index (SOI) anomaly from NCEP, the SIDC 
monthly Sun Spot Number (SSN), and the Length of Day (LoD) —are linked to the extracted signal 
components from ZTD. Results from the synchronization analysis show that ZTD and the geophysical 
signals exhibit (albeit subtle) site dependent phase synchronization index. 


1. Introduction 

With the increasing ease of collecting and homogenizing tropospheric geodetic data, a long time 
series of geodetic data sets such as Zenith Tropospheric Delay (ZTD), Water Vapor (WV), and 
geodetic site coordinates are now available to the scientific community. As a result, the accumulated 
datasets make it possible to study the geophysical signals embedded in the data. These geophysical 
signals hold important information about atmosphere-earth system dynamics. However, sequences 
of these (geodetic) observations have noise, with unknown dependency structure embedded in the 
data. Until now, the classical techniques of analyzing such series have been subjective and unable 
to characterize the stochastic structure manifested in the observations. As a result, analyzing such 
data sets required complex, yet flexible and robust, approaches of objectively manipulating the 
data. 
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In non-linear and non-stationary time series, the stochastic global component could be a func- 
tion which exhibits at most one extremum within a data span or a monotonic function within a 
certain temporal span (Zhauhua et ah, 2007). At first, the mathematical approaches of extracting 
this component were based on fitting a simple deterministic function (e.g., a linear function) to the 
data. This method of detrending is clearly suitable for a stationary world and, therefore, may not 
be robust for real-world applications such as in tropospheric delay due to WV analyses. As the 
theory of stationary time series developed, the trend was considered as a deterministic component 
that ought to be removed from the time series in order to make it stationary (Alexandrov et al. 
2009). Nevertheless, the deterministic method of trend extraction fails to account for a) the ran- 
dom irregular components, b) the conflicts within the identification of turning points in the trend, 
and c) the unclear definition of trend, acceptable degree of smoothness, trend-cycles, or long-term 
structural effects. Some of the prominent approaches of extracting the quasi-period global compo- 
nent in the data include: a) model-based approaches (a stochastic time series model for the data, 
e.g., ARMA or ARIMA model, are assumed a priori ); b) nonparametric filtering (which does not 
require a specification of a model) e.g., the Hodrick-Prescott filter; c) Singular Spectrum Analysis, 
SSA (a nonparametric methodology that does not require a specification of time series models or 
a trend a priori ); d) wavelets; e) Independent Component Analysis (ICA); and f) Empirical Mode 
Decomposition (EMD). Some of these methods of trend extraction encompass regression analysis 
or Fourier-based filtering and are based on unjustifiable linearity and stationarity assumptions. 
Furthermore, the stochastic global component often evolves from the same or part of the same 
underlying process that generates data, and therefore the temporal structure is often linked to 
local time scales. As a result, the use of functional forms to represent the unknown embedded 
trend model could be subjective. 

The importance of estimating the slowly evolving changes in tropospheric delay due to WV lies 
in the role IWV plays in geodetic measurement as well as in climate change. As a result, estimating 
this stochastic global component is a key element of many studies involving WV variability and its 
links with the dynamics and precipitation (e.g., Zveryaev and Allan 2005; Trenberth et al. 2005; 
Schneider et al. 2009). In all these studies, the functional form of the trend is deterministically and 
subjectively determined. Since the data record under consideration is generated from non-linear 
and non-stationary underlying processes, a functional form of the trend has to be stochastic and 
therefore ought not to be pre-selected. Therefore, in order to extract the trend and other (quasi-) 
periodic components in the data, the process of detrending must be data adaptive (Zhauhua et al. 
2007; Alexandrov et al. 2008). 

In this study, we examine the use of the combined Independent Component Analysis (ICA) 
and EMD to extract the local independent components in the tropospheric delay derived from 
the ECMWF reported by Bolun et al. (2006). Aires et al. (2000) demonstrated a north-south 
equatorial Atlantic SST linkage by applying the ICA algorithm to study the variability of the 
tropical sea surface temperature (SST), as well as the links between the variability of ENSO and 
Atlantic SST. In particular, the Denoising Source Separation (DSS) algorithm (see Sarela and 
Valpola 2005) can be used to identify hidden geophysical signal components Si(t) present in the 
measurements yj(t) expressed in Equation (1). 


N 

yj(t) = (!) 

2=1 

Here, the index j runs over the independent variable measurements, which could be spatial 
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locations over the discretized observation period t. This generative model describes how the 
observed data are generated by a process of mixing the underlying signals Sj(i), which are referred 
to as the ICs. In matrix formulation, the matrix of observations Y, the matrix of the underlying 
geophysical sources S, the matrix of linear or non-linear combination of the signals or mixing 
vectors A, and the Gaussian noise 5 are mathematically described by a mixing model expressed in 
Equation (2). 


Y = AS + 5 (2) 

Lastly, the causal links between ICs from different IVS and other geophysical records, such as Sun 
Spot Number (SSN), Length of Day (LoD), Southern Oscillation Index (SOI), and Quasi-Biennial 
Oscillations (QBO), are examined by use of the angle strength of the phase angle difference between 
the series, hereafter the mean phase coherence (Mormann et al. 2000). Documenting these results 
would be useful for understanding the underlying non-stationary and non-linear geophysical signal 
structure present in the data and for assessing their dependency on each other with applications 
for the short-term and long-term variability of the atmosphere. An example of research in this 
direction has been reported by Palus and Novotna (2009). These contributions would benefit 
researchers from a diversity of fields. 

2. Data and Method 
2.1. Data 

ZTD over six geodetic IVS sites (see Table 1) spanning 1998 through 2008 were obtained 
from the Institute of Geodesy and Geophysics (IGG), Technical University of Vienna, Austria 
(http://www.hg.tuwien.ac.at/~ecmwfl/VLBI). The ZWD and IWV datasets are derived from 
ECMWF numerical model simulations (Bohm et al. 2006). 

The data under consideration were originally computed at six-hour intervals for the entire 
temporal span. Monthly val- 
ues of ZTD were computed for 
the time span under considera- 
tion. The top panel in Figure 
1 depicts the apparent season- 
ality in ZTD (40 cm of ZTD 
has been added to stations ex- 
cept Hobart26 for better visual- 
ization) in all six geodetic IVS 
stations studied. Geophysical 
data are significantly affected by 
seasonal variability. In order to 
avoid strong seasonality that could mask other important signals (with, e.g., annual, interannual, 
and interseasonal variability), ZTD data sets were seasonally adjusted by use of the Ratio-to- 
Moving Average method described by Hanke et al. (2001) and have been plotted in the bottom 
panel of Figure 1 (here only HartRAO is plotted to illustrate the effect of seasonal adjustment to 
ZTD). Additionally, we have used monthly time series anomalies of four geophysical signals (here 
the 1998-2008 time epoch is used): the QBO index derived from the NCEP/NCAR reanalysis; the 


Table 1. Geodetic IVS stations used in the study and the number of 
daily data records for the period 1998 to 2008. 


Station 

Latitude, Longitude, and Height 

# recs 

HartRAO 

-25.89°, 27.69°, 1416.12 m 

4191 

Hobart26 

-42.80°, 147.44°, 65.53 m 

4191 

Wettzell 

49.15°, 12.88°, 669.56 m 

4191 

Westford 

42.61°, 288.51°, 87.20 m 

4191 

Tsukub32 

36.10°, 140.09°, 85.09 m 

4183 

Gilcreek 

64.98°, 212.50°, 332.52 m 

4191 
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SOI anomaly from NCEP; the SIDC monthly SSN available at www.sidc.oma.be; and the LoD, 
which is an IERS product. 

2.2. Extracting Local Oscillatory Signals in ZTD by Use of EEMD-ICA 

The study of ZTD variability is of particular interest, a) because of the inherent mixture of 
different signal components that exhibit large spatio-temporal time-scales, and b) because the in- 
herent variability is governed by several complex periodic and aperiodic phenomena. In the present 
study, we propose a combination of Ensemble EMD; hereafter EEMD (see Zhaohua and Huang 
2009) and ICA to isolate the low frequency components in ZTD and thereafter investigate the 
phase dynamics of the oscillating independent components without imposing the assumption of 
linearity on them. Firstly, the data is decomposed into spectrally independent oscillatory modes 
called the Intrinsic Mode Functions (IMFs). The resulting IMFs contain inherent additive noise, 
which dominates the high frequency IMFs. As a result, only those IMFs with lower noise con- 
tamination are selected for use in the ICA step. The selection of physically significant IMFs to be 
used in the ICA stage is a two-step process. In the first step, the IMFs are selected based on the 
level of noise contamination in the sample. If the IMF] is assumed to be highly contaminated, 
then its amplitude could be used as a reference of the amplitude of the noise contaminated IMFs. 
Statistically independent IMFs are extracted by selecting only those IMFs whose global energy is 
at least 30% of the energy of I M F\ . In the second step, ICA reduces the dimensionality of the 
IMFs adaptively. Given that the number of statistically independent IMFs is smaller than the 
original IMFs, ICA generally merges the statistically dependent components into the same group 
and therefore reduces the dimensional space of IMFs. 

Furthermore, ICA is a statistical technique that decomposes a series of observations into 
a linear combination of non-Gaussian random variables that are highly independent compo- 
nents (Hyvarinen 1999). An Extended FastICA (EFICA) algorithm was used to adaptively ex- 
tract the non-linear components in the data. (FastICA is a Matlab software freely available at 
http://www.cis.hut.fi/projects/ica/.) The EFICA algorithm utilizes the generalized symmetric 
FastICA as well as the adaptive selection of the first derivative of the non-linear function called 
the contrast function (Koldovsky et al. 2006) followed by a refinement step. In general, the 
EEMD-ICA algorithm used in this study can be summarized as follows: 

1. Add iid white noise of zero-mean and and = Aoo to ZWD. Here, the noise parameter was 
arbitrarily taken as A ~ 30%. 

2. Derive the set of ZTD IMFs by applying EMD until the last IMF is a monotonic function or 
has at most one extremum. 

3. Repeat steps 1 and 2 a number of times, each time using a different randomly generated and- 
resulting in an ensemble of IMFs (120 ensembles have been used). 

4. Average over the ensemble to obtain a set of averaged IMFs. 

5. Perform EFICA on the IMFs obtained in step 4 via three steps: 

(a) run the original symmetric FastICA until convergence. In FastICA, the ICs are estimated 
based on the assumption that the mutually statistically independent components in the 
mixture exhibit non-Gaussian probability distributions. In order to assess the statistical 
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independence of the geophysical signals in ZTD, the fourth-order cumulant (kurtosis k) 
defined in (3) is used. 


«v) = E(y 4 ) - 3 {E(y 2 )f (3) 

The kurtosis utilized in computing the statistical independence of the signal present 
in ZTD could be optimized by using either the fixed-point gradient learning algorithm 
(Hyvarinen and Oja 1997) or the Newton’s method described in Nocedal and Wright 
(1999). 

(b) different non-linearities (the derivative of the score function) are adaptively selected and 
used to estimate the score functions of the components derived in step (a) above. 

(c) components derived from FastICA are fine-tuned using the non-linearities computed in 

(b) above. This results in an accurate estimation of the unmixing matrix W (=A _1 ). 

6. Slowly varying components are selected and multiplied with the mixing matrix to back- 
reconstruct the selected IMF set. 

The causal relationships among the ICs from different stations and between dominant signals of 
the QBO, SOI, SSN, and LoD are investigated by use of phase synchronization (e.g., Shelter et al. 
2007; Kreuz et al. 2007). Synchronization between dynamical systems is an active field of scientific 
and technical research. For a review and description of the various concepts of synchronization 
detection, refer to Rosenblum et al. (2001) and Pareda et al. (2005) and others therein. 

There are many different approaches of quantifying the degree of synchronization between two 
systems: linear approaches like the cross correlation or the coherence function, as well as non- 
linear measures such as mutual information (e.g., Kreuz et al. 2007 and references therein). In 
the present work, the relation between phases of two geophysical signals and derived tropospheric 
components over different temporal scales is used as a measure of the presence of interaction 
between the systems. The relation does not necessarily mean that they are synchronized. 

The concept of analytic signal as applied in signal processing is used to define the phase of 
an arbitrary signal in this work. This analytic signal approach has the advantage that the phase 
can easily be obtained from scalar time series (Rosenblum and Kurths 1998). For a univariate 
measurement x{t) the analytic signal C( t ) is defined by Equation (4): 

C(t) = x(t) + jx(t) = A% (f)e^" (t) (4) 

Here, the function x is the Hilbert transform of x(t) defined by Equation (5); 

x(t) = *-'p r E± dT (5 , 

4—oo t 

The P in Equation (5) suggests that the integral is taken in the sense of the Cauchy principal 
value. This implies that the Instantaneous Phases 4>(t) and Instantaneous Amplitude A(t) are 
uniquely defined from Equation (4). Therefore, given two interacting geophysical signals x(t) and 
y(t), the phase difference of their analytic signals given by Equation (6) can be used to derive the 
phase synchronization index defined in Equation (7). 

(frxyit) = - m(f>y (t) (6) 
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n = 


( e ^W) t = y (cos </>^(t)) f 2 + (sin^(t))? 


( 7 ) 


In Equation (6), n and m are integers defining the phase locking ratio ( n:m ). The synchro- 
nization index i] ranges from 0 (which suggests that the phase difference is uniformly distributed 
and there is no phase synchronization) to 1 (which implies that there is perfect synchronization 
and the phase difference is a constant). 

It is worth noting that the use of i] to assess the linkage (interaction) between two geophysical 
signals requires that the independent components in the signal be separated. The rationale behind 
this analysis is based on the fact that, if the variable represented by one of the series is indeed caused 
(or at least influenced) by the underlying dynamics in the second series, then the characteristic 
fluctuations in the ICs derived from the first ICs ought to be preceded by a reaction in the ICs 
computed from the second series. 


3. Results and Discussion 

Monthly ZTD delays at the IVS stations (see Table 1) derived from NWP model simulations 
have been decomposed into the corresponding IMFs. This decomposition stage separates low and 
high frequency modes. As depicted in Figure 2, the ZTD delay at all the IVS stations (except 
Tsukub32 which has fewer data records) were decomposed into seven IMFs. We have combined the 
IMFs with the lowest frequencies (here we have used IMFg and IMF7) in order to generate a weakly 
oscillating component of the series; hereafter the trend. Figure 2 illustrates a station-dependent 
trend (this is mode 5 (for Tsukub32) or mode 6 (for all other IVS stations)) with characteristic 
decadal fluctuations. Additionally, in all the IVS stations, the highly fluctuating IMF (first mode) 
has oscillating structure with a characteristic period of about one month. 


TSUKUB32 WETTZELL 



Figure 1. Top panel: tropospheric delay due to wa- Figure 2. Mode Functions (IMFs) of ZTD series over 
ter vapor at IVS stations, with 40 mm ZTD added to IVS stations, 
each station except Hobart26 for better visualization. 

[a — ► f: Tsukub32, Wettzell, Westford, Gilcreek, 

HartRAO, and Hobart26.] Bottom panel: season- 
ally adjusted and unadjusted ZTD over HartRAO. 

Higher periods are not visible because the ZTD series is monthly averaged. The second and 
third oscillation modes in all the IVS stations exhibit intra-annual and seasonal oscillation periods. 
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Figure 3. Station dependent quality index Iq of in- 
dependent components derived from zenith total de- 
lay: [a — > f : Tsukub32, Wettzell, Westford, Gilcreek, 
HartRAO, and Hobart26]. 


Figure 4. Estimated independent components in 
zenith total delay: [a — > f: Tsukub32, Wettzell, 
Westford, Gilcreek, HartRAO, and Hobart26]. 


These modes could be associated with regional or local weather processes at each IVS site. Annual 
oscillation periods (corresponding to the fourth and fifth modes) are evident in ZTD series for all 
IVS stations except Tsukub32. This oscillation pattern could be associated with relatively global 
geophysical processes such as North Atlantic tele-connection patterns and ENSO. 

In the present study, the physical causes of variability of ZTD are assessed by extracting the 
internal regularities, which include linear or non-linear mixtures of different statistically indepen- 
dent components, by use of ICA. Since we do not have information about the signals embedded in 
the ZTD data a priori, a blind source separation procedure is the natural choice (see for example 
Sarelaa and Valpola 2005). In the FastICA algorithm (Hyvarinen 1999), where a pre-estimate 
of the unmixing matrix is computed, the statistically independent IMFs are computed based on 
optimizing contrast functions, i.e., to solve for the signals embedded in the observations; a cost 
function that either maximizes the non-Gaussianity or minimizes the mutual information is for- 
mulated (Del Negro et al. 2008). The quality index I q of the clusters used to estimate the ICs in 
the FastICA algorithm are depicted in Figure 3 (here, a — ► f corresponds to Tsukub32, Wettzell, 
Westford, Gilcreek, HartRAO, and Hobart26, respectively). From Figure 3, it can be seen that 
each of the IVS stations considered in the study contains ICs with unique I q values. This finding 
is further evidence that the variability of ZTD at each IVS station is driven primarily by local 
atmospheric processes. 

In all the stations, the magnitude of I q of all the clusters is generally greater than 0.5. Fur- 
thermore, the Ig for all the IVS stations exhibit a knee at cluster 3. This is a strong evidence of 
some dependence of the geophysical signal derived from ICs at each IVS site as extracted from 
cluster 3. Therefore, the variability mode of the ZTD could be associated with a geophysical signal 
with a global geophysical origin such as the climatic tele-connection patterns. Figure 4 depicts 
the estimated ICs of each cluster (arranged in the order of the clustering quality index) of ZTD at 
IVS stations. It is evident that the variability structure of the each component is unique at each 
station. In the present work, the cluster quality index order is used to select the four components 
that are used to examine the interaction between ZTD and other geophysical signals. 

In assessing the nature of linkage between the independent signal components in ZTD with 
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the geophysical signals, ZTD components that exhibited near Gaussian distribution were selected. 
Station dependent Gaussian distributed signal components are given in Table 2. Here, a Gaussian 
shaped distribution depicts a high degree of localization, which suggests that the signal component 
does not contain spurious components. The linkage between the selected geophysical components 
(such as those selected in Table 2) is computed and tested by use of the phase synchronization index 
(see, for instance, Kreuz et al. 2007). In this methodology of non-linear dynamics, the independent 
components embedded in the mixture of ZWD and other geophysical signals are analyzed focusing 
on the phases of the fluctuations in each component. From our analysis, it is evident that the 
signal components extracted from ZTD and the geophysical signals QBO, SOI, SSN, and LoD 
show some degree of interaction among themselves (see Table 3). As tabulated in Table 3, the 
phase synchronization index at all IVS stations varies between 0.45 and 0.62 based on the phase 
locking order 1:1. These values vary across the stations that were considered in this study and, 
therefore, seem to suggest spatial dependence. As illustrated in Table 3, ZTD observations at 
Tsukub32 and Westford stations appear not to contain signal components that have dynamical 
interdependence with QBO, SOI, SSN, and LoD, i.e., the synchronization values are below 0.5. 

This signifies weak coupling. The low syn- 
chronization index between the ZTD signal Table 2. Signal components with approximate Gaus- 
components and QBO, SOI, SSN, and LoD §ian structure at six IVS stations. Column 2 has se- 

could be attributed to strong noise or irregu- lected ZTD components, and column 3 has geophysi- 

lar ZTD fluctuations (which is due to atmo- ca \ signal components, 
spheric dynamics) and non-stationarity (Botai 
et al. 2009). Further, we observe that Wettzell, 

Gilcreek, HartRAO, and Hobart exhibit syn- 
chronization indices greater than 0.5. The pres- 
ence of interaction between ZTD measurements 
and QBO, SOI, SSN, and LoD at the four IVS 
stations seems to suggest that QBO, SOI, SSN, 
and LoD signals are indeed embedded in the 
data. 

Table 3. Synchronization index derived from ZTD and geophysical signals. 


VLBI Station 

Component pairs 

Corresponding phase synchronization index 

Wettzell 

1-3; 1-4; 2-3; 2-4; 3-3; 3-4; 4-3; 4-4 

0.48; 0.58; 0.47; 0.62; 0.43; 0.61; 0.42 

Tsukub32 

1-3; 1-4; 3-3; 3-4; 4-3; 4-4 

0.47; 0.46; 0.47; 0.46; 0.47; 0.45 

Westford 

1-4; 2-4; 3-4 

0.46; 0.49; 0.46 

Gilcreek 

1-3; 2-3; 6-3 

0.55; 0.58; 0.58 

HartRAO 

1-3; 2-4; 2-3; 2-4 

0.54; 0.53; 0.57; 0.50 

Hobart 

1-3; 1-4; 2-3; 2-4; 3-3; 3-4 

0.57; 0.48; 0.57; 0.49; 0.56; 0.48 
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4. Conclusion 

The aim of this study was twofold. Firstly, the combined empirical mode decomposition and 
independent component analysis methods were used to extract signal components embedded in 
the tropospheric delay due to water vapor. Results suggest that the tropospheric delay data is a 
mixture of signal components, each with a different temporal structure. These signal components 
exhibit spatial dependence suggesting that the high modes of ZTD fluctuations are driven by 
local atmospheric dynamics. Secondly, in order to assess the linkage of these signal components 
to geophysical processes, measures of synchronization were done based on the Hilbert transform 
analytic signal approach. Although further verification on a larger set of IVS stations and other 
geophysical signals is required, our results reveal that tropospheric delay and the geophysical 
signals have some dynamical coupling. 
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